This paper presents a method for detecting and recognising traffic signs based on information extracted from an event camera. The solution uses a FireNet deep convolutional neural network to reconstruct greyscale frames from events. Two YOLOv4 network models were trained, one on greyscale images and the other on colour images. The model trained on greyscale images performed best, achieving an efficiency of 87.03%.
translated by Google Translate
This paper proposes the use of an event camera as a component of a vision system that counts fast-moving objects - in this case, falling corn grains. This type of camera transmits information about changes in the brightness of individual pixels and is characterised by low latency, no motion blur, correct operation in different lighting conditions, and very low power consumption. The proposed counting algorithm processes events in real time. The solution was demonstrated on a stand consisting of a chute with a vibrating feeder, which allowed the number of falling grains to be adjusted. The objective of the control system, which used a PID controller, was to maintain a constant average number of falling objects. The proposed solution was subjected to a series of tests to verify that the developed method operates correctly. The results confirm that an event camera is a valid choice for counting small, fast-moving objects and that it has a wide range of potential industrial applications.
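The control loop described above can be sketched as a textbook discrete PID controller. This is a minimal illustration only: the `PID` class, the gains, the sampling period, and the first-order toy model of the feeder (`simulate`) are all assumptions made for the example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative assumptions)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measured: float, dt: float) -> float:
        """Return the control output for one sampling period of length dt."""
        error = setpoint - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps: int = 200, dt: float = 0.1, target: float = 50.0) -> float:
    """Drive a hypothetical feeder toward `target` grains per second."""
    pid = PID(kp=0.02, ki=0.1, kd=0.001)
    rate = 0.0  # measured grain rate, as an event-based counter might report it
    for _ in range(steps):
        # Feeder drive is clamped to the range [0, 1].
        drive = min(max(pid.step(target, rate, dt), 0.0), 1.0)
        # Toy plant (assumption): the rate relaxes toward 100 * drive
        # with a time constant of 0.5 s.
        rate += (100.0 * drive - rate) * dt / 0.5
    return rate
```

In a real stand, `rate` would be replaced by the grain count reported by the event-based counter over each sampling period, and `drive` would set the vibration amplitude of the feeder.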
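In recent years, event cameras (DVS - Dynamic Vision Sensors) have been used in vision systems as an alternative or a supplement to traditional cameras. They are characterised by high dynamic range, high temporal resolution, low latency, and reliable performance in limited lighting conditions - parameters that are particularly important in the context of advanced driver assistance systems (ADAS) and self-driving cars. In this work, we test whether these rather novel sensors can be applied to the popular task of traffic sign detection. To this end, we analyse different representations of the event data: the event frame, the event frequency, and the exponentially decaying time surface, and we apply video frame reconstruction using a deep neural network called FireNet. We use the deep convolutional neural network YOLOv4 as the detector. For the individual representations, we obtain a detection accuracy of 86.9-88.9% mAP@0.5. Fusing the considered representations yields a detector with a higher accuracy of 89.9% mAP@0.5. In comparison, the detector trained on frames reconstructed with FireNet achieves 52.67% mAP@0.5. The results illustrate the potential of event cameras in automotive applications, either as standalone sensors or in close cooperation with typical frame-based cameras.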
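The three event representations named above can be sketched as follows. The abstract does not give their exact definitions, so this uses common conventions as assumptions: the event frame stores the polarity of the most recent event at each pixel, the event frequency is a raw per-pixel event count, and the time surface is exp(-(t_ref - t_last) / tau). The function name and the `tau` default are hypothetical.

```python
import numpy as np


def event_representations(x, y, t, p, shape, t_ref, tau=0.05):
    """Build three per-pixel maps from events (x, y, t, p), with p in {-1, +1}."""
    x = np.asarray(x)
    y = np.asarray(y)
    t = np.asarray(t, dtype=float)
    p = np.asarray(p)

    frame = np.zeros(shape)            # polarity of the latest event per pixel
    count = np.zeros(shape)            # number of events per pixel
    last_t = np.full(shape, -np.inf)   # timestamp of the latest event per pixel

    # Process events in time order so later events overwrite earlier ones.
    for i in np.argsort(t, kind="stable"):
        frame[y[i], x[i]] = p[i]
        count[y[i], x[i]] += 1
        last_t[y[i], x[i]] = t[i]

    # Exponentially decaying time surface; pixels without events evaluate to 0.
    surface = np.exp(-(t_ref - last_t) / tau)
    return frame, count, surface
```

A fused input for a detector could then be formed by stacking the three maps as image channels, for example with `np.stack([frame, count, surface], axis=-1)`, although the fusion scheme used in the paper is not specified in the abstract.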
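This paper presents a hardware-in-the-loop (HIL) simulation system for unmanned aerial vehicle (UAV) control algorithms implemented on a heterogeneous SoC FPGA computing platform. The AirSim simulator running on a PC and an Arty Z7 development board with a Zynq SoC chip from AMD Xilinx were used. Communication was carried out over a serial USB link. Autonomous landing on a specially marked landing strip was selected as a case study. A landing site detection algorithm was implemented on the Zynq SoC platform, which allowed a 1280 x 720 @ 60 fps video stream to be processed in real time. The tests performed showed that the system works correctly and exhibits nothing that could negatively affect the stability of the control. The proposed concept is characterised by relative simplicity and low implementation cost. At the same time, it can be used to test various types of advanced perception and control algorithms for UAVs implemented on embedded platforms. We provide the developed code on GitHub; it includes Python scripts that run on the PC and C code that runs on the Arty Z7.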
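Neuromorphic vision is a rapidly growing field with numerous applications in the perception systems of autonomous vehicles. Unfortunately, due to the sensors' working principle, there is a significant amount of noise in the event stream. In this paper, we present a novel algorithm based on an IIR filter matrix for filtering this type of noise, together with a hardware architecture that allows it to be accelerated on an SoC FPGA. Our method filters uncorrelated noise very efficiently: over 99% of noisy events are removed. It has been tested on several event datasets with added random noise. We designed the hardware architecture to reduce the utilisation of the FPGA's internal BRAM resources. This yields very low latency and a throughput of up to 3858 MEPS (million events per second). The proposed hardware architecture was verified in simulation and in hardware on a Xilinx Zynq UltraScale+ MPSoC chip on the Mercury+ XU9 module.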
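The abstract does not describe the internals of the IIR filter matrix, so the following is only a generic sketch of the underlying idea, not the authors' algorithm. It assumes one first-order IIR state per pixel that decays exponentially over time, and it keeps an event only if its 3x3 neighbourhood shows enough recent activity. The function name, `tau`, and `threshold` are hypothetical.

```python
import numpy as np


def iir_filter_events(events, shape, tau=0.01, threshold=0.5):
    """Keep events whose 3x3 neighbourhood showed recent activity.

    events: iterable of (x, y, t) tuples sorted by timestamp t (seconds).
    Illustrative scheme only; parameters are assumptions.
    """
    activity = np.zeros(shape)  # one first-order IIR state per pixel
    last_t = 0.0
    kept = []
    for x, y, t in events:
        # Decay every state by the time elapsed since the previous event.
        activity *= np.exp(-(t - last_t) / tau)
        last_t = t
        y0, y1 = max(y - 1, 0), min(y + 2, shape[0])
        x0, x1 = max(x - 1, 0), min(x + 2, shape[1])
        # Isolated (uncorrelated) events see no nearby activity and are dropped.
        if activity[y0:y1, x0:x1].sum() >= threshold:
            kept.append((x, y, t))
        activity[y, x] += 1.0  # the event itself feeds the filter state
    return kept
```

Decaying the whole matrix on every event keeps the sketch short; a hardware design would more plausibly update only the states it reads, which is one reason memory layout matters for BRAM usage.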
Petrov-Galerkin formulations with optimal test functions allow for the stabilization of finite element simulations. In particular, given a discrete trial space, the optimal test space induces a numerical scheme delivering the best approximation in terms of a problem-dependent energy norm. This ideal approach has two shortcomings: first, we need to explicitly know the set of optimal test functions; and second, the optimal test functions may have large supports, inducing expensive dense linear systems. Nevertheless, parametric families of PDEs are an example where it is worth investing some (offline) computational effort to obtain stabilized linear systems that can be solved efficiently, for a given set of parameters, in an online stage. Therefore, as a remedy for the first shortcoming, we explicitly compute (offline) a function mapping any PDE-parameter to the matrix of coefficients of optimal test functions (in a basis expansion) associated with that PDE-parameter. Next, as a remedy for the second shortcoming, we use low-rank approximation to hierarchically compress the (non-square) matrix of coefficients of optimal test functions. In order to accelerate this process, we train a neural network to learn a critical bottleneck of the compression algorithm (for a given set of PDE-parameters). When solving the resulting (compressed) Petrov-Galerkin formulation online, we employ a GMRES iterative solver with inexpensive matrix-vector multiplications thanks to the low-rank features of the compressed matrix. We perform experiments showing that the full online procedure is as fast as the original (unstable) Galerkin approach. In other words, we get the stabilization with hierarchical matrices and neural networks practically for free. We illustrate our findings by means of 2D Eriksson-Johnson and Helmholtz model problems.
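The reason low-rank compression makes matrix-vector products inexpensive can be sketched for a single matrix block. The sketch assumes a plain truncated SVD as the compression step, which is a standard choice but is not confirmed by the abstract; the function names and the tolerance are hypothetical. A hierarchical matrix would apply the same idea recursively to suitable sub-blocks.

```python
import numpy as np


def low_rank_factors(block, tol=1e-8):
    """Truncated SVD: return U, V with block ~= U @ V.

    Singular values below tol * (largest singular value) are dropped.
    """
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    rank = max(1, int(np.sum(s > tol * s[0])))
    return u[:, :rank] * s[:rank], vt[:rank, :]


def low_rank_matvec(U, V, x):
    """Compute (U @ V) @ x in O(r (m + n)) operations instead of O(m n)."""
    return U @ (V @ x)
```

For an m x n block of rank r, the factors store r (m + n) numbers instead of m n, and each product inside an iterative solver such as GMRES costs proportionally less.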
The celebrated proverb that "speech is silver, silence is golden" has a long multinational history and multiple specific meanings. In written texts, punctuation can in fact be considered one of its manifestations. Indeed, the virtue of speaking and writing effectively involves - often decisively - the capacity to place breaks properly. In the present study, based on a large corpus of world-famous and representative literary texts in seven major Western languages, it is shown that the distribution of intervals between consecutive punctuation marks in almost all texts can universally be characterised by only two parameters of the discrete Weibull distribution, which can be given an intuitive interpretation in terms of the so-called hazard function. The values of these two parameters tend to be language-specific, however, and even appear to navigate translations. The properties of the computed hazard functions indicate that, among the studied languages, English turns out to be the least constrained by the necessity to place a consecutive punctuation mark to partition a sequence of words. This may suggest that, compared to the other studied languages, English is more flexible, in the sense of allowing longer uninterrupted sequences of words. Spanish reveals a similar tendency, to only a slightly lesser extent.
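The two-parameter description and its hazard-function interpretation can be made concrete with a short sketch. It assumes the common type I parametrisation of the discrete Weibull distribution, with survival function P(X > k) = (1 - p)^(k^beta) for interval lengths k = 1, 2, ...; the abstract does not state which parametrisation the study uses, and the function names are hypothetical.

```python
def discrete_weibull_pmf(k, p, beta):
    """P(X = k) for k = 1, 2, ..., given P(X > k) = (1 - p) ** (k ** beta)."""
    q = 1.0 - p
    return q ** ((k - 1) ** beta) - q ** (k ** beta)


def hazard(k, p, beta):
    """P(X = k | X >= k): probability that a punctuation mark follows word k,
    given that none has occurred in the preceding k - 1 words."""
    q = 1.0 - p
    return 1.0 - q ** (k ** beta - (k - 1) ** beta)
```

Under this parametrisation, beta = 1 gives a constant hazard equal to p (the geometric distribution), while beta > 1 gives a hazard that grows with k, so that the pressure to place a punctuation mark increases as the run of words gets longer.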
Recent advances in visual representation learning have made it possible to build an abundance of powerful off-the-shelf features that are ready to use for numerous downstream tasks. This work aims to assess how well these features preserve information about the objects, such as their spatial location, their visual properties and their relative relationships. We propose to do so by evaluating them in the context of visual reasoning, where multiple objects with complex relationships and different attributes are at play. More specifically, we introduce a protocol to evaluate visual representations for the task of Visual Question Answering. In order to decouple visual feature extraction from reasoning, we design a specific attention-based reasoning module which is trained on the frozen visual representations to be evaluated, in a spirit similar to standard feature evaluations relying on shallow networks. We compare two types of visual representations, densely extracted local features and object-centric ones, against the performance of a perfect image representation using ground truth. Our main findings are two-fold. First, despite excellent performance on classical proxy tasks, such representations fall short in solving complex reasoning problems. Second, object-centric features better preserve the critical information necessary to perform visual reasoning. In our proposed framework, we show how to approach this evaluation methodically.
Language models are widely deployed to provide automatic text completion services in user products. However, recent research has revealed that language models (especially large ones) bear considerable risk of memorizing private training data, which is then vulnerable to leakage and extraction by adversaries. In this study, we test the efficacy of a range of privacy-preserving techniques to mitigate unintended memorization of sensitive user text, while varying other factors such as model size and adversarial conditions. We test both "heuristic" mitigations (those without formal privacy guarantees) and Differentially Private training, which provides provable levels of privacy at the cost of some model performance. Our experiments show that, with the exception of L2 regularization, heuristic mitigations are largely ineffective in preventing memorization in our test suite, possibly because they make overly strong assumptions about the characteristics that define "sensitive" or "private" text. In contrast, Differential Privacy reliably prevents memorization in our experiments, despite its computational and model-performance costs.
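The abstract does not say which Differentially Private training algorithm was used. As an illustration only, the core step of the widely used DP-SGD scheme is sketched below: each example's gradient is clipped to a fixed norm, the clipped gradients are summed, and Gaussian noise scaled to the clipping norm is added before averaging. The function name and arguments are hypothetical, and the privacy accounting that turns the noise level into a formal guarantee is omitted.

```python
import numpy as np


def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient in the style of DP-SGD.

    per_example_grads: array of shape (batch, dim), one gradient per example.
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale down any gradient whose L2 norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)
```

Clipping bounds how much any single training example can move the model, and the noise masks what remains. The same two operations account for the computational and model-performance costs mentioned above.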
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.